Understanding Taints and Tolerations in Kubernetes

Introduction

Taints and tolerations in Kubernetes provide a mechanism to control which nodes can accept which pods. They allow nodes to repel certain pods unless those pods explicitly tolerate the conditions, ensuring that workloads are placed appropriately within a cluster. This feature is especially useful when you want to dedicate nodes to specific workloads, like critical services or isolated environments.

What Are Taints?

Taints are applied to nodes to prevent pods from being scheduled on them unless the pods have matching tolerations. A taint consists of a key, a value, and an effect. The effect defines what happens to pods that do not tolerate the taint, and it is one of NoSchedule, PreferNoSchedule, or NoExecute (covered in detail below).

Example of Adding a Taint to a Node

kubectl taint nodes node1 key=value:NoSchedule

This command taints node1 with a key-value pair of key=value and an effect of NoSchedule. Pods without a matching toleration will not be scheduled on this node.
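A taint can be removed later by running the same command with a trailing hyphen, which is kubectl's convention for deleting a taint:

kubectl taint nodes node1 key=value:NoSchedule-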

What Are Tolerations?

Tolerations are applied to pods and define the conditions that a pod can tolerate. By adding a toleration to a pod, you allow it to be scheduled on nodes with matching taints. However, it is important to note that tolerations do not guarantee that the pod will be scheduled on the tainted node; they only allow it to be scheduled there.

Key Point: Tolerations Don’t Guarantee Pod Placement

While tolerations allow a pod to be placed on a tainted node, they do not force it. Kubernetes uses a scheduler to decide where to place the pod based on various factors like resource availability. If multiple nodes (both tainted and non-tainted) can accept the pod, the scheduler might place the pod on any suitable node.

Example of Adding a Toleration to a Pod

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
  containers:
  - name: my-container
    image: nginx

This pod has a toleration for the taint with key=value and NoSchedule effect, allowing it to be scheduled on any node with that taint.
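Instead of matching an exact value, a toleration can use the Exists operator to tolerate any taint with a given key, regardless of its value (note that no value field is specified):

  tolerations:
  - key: "key"
    operator: "Exists"
    effect: "NoSchedule"

This is useful when the taint value varies across nodes but the key alone identifies the condition.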

Understanding the Taint Effects

There are three values for the effect field in taints. Each has a specific impact on how Kubernetes schedules or evicts pods:

NoSchedule: new pods without a matching toleration are not scheduled onto the node; pods already running there are left in place.

PreferNoSchedule: a soft version of NoSchedule; the scheduler tries to avoid placing non-tolerating pods on the node but may still do so if no other node fits.

NoExecute: new pods are not scheduled, and pods already running on the node that do not tolerate the taint are evicted.
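A PreferNoSchedule taint is the soft variant: the scheduler tries to keep non-tolerating pods off the node but will still use it as a last resort. The node name node1 below is just an example:

kubectl taint nodes node1 key=value:PreferNoSchedule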

Example of Node Maintenance with NoExecute Taint

When performing node maintenance, you might want to evict all non-critical workloads. Applying a NoExecute taint will remove non-tolerant pods from the node.

kubectl taint nodes maintenance-node key=value:NoExecute

This will evict any pods that don't tolerate the key=value taint from maintenance-node.
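Pods that do tolerate a NoExecute taint stay on the node indefinitely by default. The optional tolerationSeconds field bounds how long they remain after the taint is applied, after which they are evicted anyway:

  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoExecute"
    tolerationSeconds: 3600

Here the pod would be evicted one hour after the taint appears, which is useful for draining a node gradually.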

Use Cases for Taints and Tolerations

Example: Dedicated Node for Critical Workloads

Suppose you have a node dedicated to critical workloads. You can taint the node and add tolerations to the critical pods.

Taint the Node:

kubectl taint nodes critical-node app=critical:NoSchedule

This command taints the critical-node so that only pods with a toleration for app=critical can be scheduled there.

Add Toleration to Critical Pods:

apiVersion: v1
kind: Pod
metadata:
  name: critical-pod
spec:
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "critical"
    effect: "NoSchedule"
  containers:
  - name: critical-container
    image: critical-app-image

This pod can now be scheduled on critical-node because it has the matching toleration for the taint.
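Keep in mind that the taint only keeps other pods off critical-node; it does not keep critical-pod off the remaining nodes. To pin the pod to the dedicated node, label the node and add a matching nodeSelector to the pod spec (the dedicated=critical label below is an assumption; any label key and value will work):

kubectl label nodes critical-node dedicated=critical

Then, alongside the toleration in the pod spec:

spec:
  nodeSelector:
    dedicated: critical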

Auto-Applied Taints on Master Nodes

By default, control-plane nodes (historically called master nodes) are automatically tainted to prevent general workloads from being scheduled on them. The taint prevents any pod that doesn't have a matching toleration from being scheduled there, reserving these nodes for control-plane components such as the API server and scheduler.

Viewing Master Node Taints

kubectl describe node master-node | grep Taints

This command allows you to view the taints on the master node. You will typically see a taint like node-role.kubernetes.io/master:NoSchedule or, on Kubernetes 1.24 and later, node-role.kubernetes.io/control-plane:NoSchedule, which prevents ordinary workloads from being scheduled on the node.
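On a single-node cluster (for example, one created with kubeadm for testing), you may want ordinary workloads to run on the control-plane node. You can allow this by removing the taint with a trailing hyphen; the node name master-node follows the example above:

kubectl taint nodes master-node node-role.kubernetes.io/control-plane:NoSchedule-

On clusters older than Kubernetes 1.24, the taint key is node-role.kubernetes.io/master instead.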

Conclusion

Taints and tolerations are vital tools in Kubernetes for controlling pod placement across nodes. They allow you to define restrictions and dedicate specific nodes for certain workloads. However, it's important to remember that tolerations do not guarantee pod placement on a specific node; they only allow the pod to be placed there. Kubernetes scheduling decisions are based on multiple factors, and the taint-toleration mechanism is one part of that process.